28 research outputs found

    Development of Energy Models for Design Space Exploration of Embedded Many-Core Systems

    This paper introduces a methodology to develop energy models for the design space exploration of embedded many-core systems. The design process of such systems can benefit from sophisticated models: software and hardware can be specifically optimized based on comprehensive knowledge of the application scenario and hardware behavior. The contribution of our work is an automated framework that estimates the energy consumption at an arbitrary abstraction level without requiring further information about the system. We validated our framework with the configurable many-core system CoreVA-MPSoC. Compared to a gate-level simulation of the CoreVA-MPSoC in a 28 nm FD-SOI standard cell technology, our framework shows an average estimation error of about 4%. Presented at HIP3ES 2018.
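
    As a rough illustration of what such an estimation framework computes (this is not the authors' implementation; the component names, per-event energy costs, and counts below are invented), the following Python sketch sums per-event energies over abstract event counts and reports the relative error against an assumed gate-level reference:

        # Hypothetical sketch: estimate energy from abstract event counts and
        # per-event energy costs, then compare against a gate-level reference.
        # All component names and numbers are invented for illustration.

        ENERGY_PER_EVENT_PJ = {  # assumed energy cost per event in picojoules
            "cpu_cycle": 10.0,
            "local_mem_access": 2.5,
            "noc_flit": 5.0,
        }

        def estimate_energy_pj(event_counts):
            """Sum energy over all counted events of a simulated application run."""
            return sum(ENERGY_PER_EVENT_PJ[event] * count
                       for event, count in event_counts.items())

        def estimation_error(estimate_pj, gate_level_pj):
            """Relative error of the abstract estimate vs. a gate-level simulation."""
            return abs(estimate_pj - gate_level_pj) / gate_level_pj

        if __name__ == "__main__":
            counts = {"cpu_cycle": 1_000_000, "local_mem_access": 250_000, "noc_flit": 40_000}
            est = estimate_energy_pj(counts)
            ref = 11.0e6  # assumed gate-level result in pJ
            print(f"estimate: {est / 1e6:.2f} uJ, error: {estimation_error(est, ref):.1%}")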

    System-Level Analysis of Network Interfaces for Hierarchical MPSoCs

    Ax J, Sievers G, Flasskamp M, Kelly W, Jungeblut T, Porrmann M. System-Level Analysis of Network Interfaces for Hierarchical MPSoCs. In: Proceedings of the 8th International Workshop on Network on Chip Architectures (NoCArc). New York, NY, USA: ACM; 2015: 3-8.
    Network Interfaces (NIs) are used in Multiprocessor System-on-Chips (MPSoCs) to connect CPUs to a packet-switched Network-on-Chip. In this work we introduce a new NI architecture for our hierarchical CoreVA-MPSoC. The CoreVA-MPSoC targets streaming applications in embedded systems. The main contribution of this paper is a system-level analysis of different NI configurations, considering both software and hardware costs for NoC communication. Different configurations of the NI are compared using a benchmark suite of 10 streaming applications. The best performing NI configuration shows an average speedup of 20 for a CoreVA-MPSoC with 32 CPUs compared to a single CPU. Furthermore, we present physical implementation results using a 28 nm FD-SOI standard cell technology. A hierarchical MPSoC with 8 CPU clusters and 4 CPUs in each cluster running at 800 MHz requires an area of 4.56 mm².
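
    A minimal sketch of the speedup comparison reported above, assuming invented benchmark names, throughput figures, and NI configuration labels (ni_dma, ni_no_dma); the geometric mean is used here as one reasonable way to average per-benchmark speedups and is not necessarily the averaging used in the paper:

        # Hypothetical sketch: per-benchmark and average speedup of MPSoC
        # configurations over a single-CPU baseline, as in a system-level
        # comparison of NI configurations. All numbers are invented.

        from statistics import geometric_mean

        # Throughput (items/s) per benchmark for one CPU and two assumed NI configs.
        single_cpu = {"fft": 1.0e5, "filterbank": 4.0e4, "des": 2.0e4}
        ni_configs = {
            "ni_dma":    {"fft": 2.2e6, "filterbank": 8.5e5, "des": 3.8e5},
            "ni_no_dma": {"fft": 1.5e6, "filterbank": 6.0e5, "des": 2.9e5},
        }

        def speedups(config_throughput):
            """Speedup of one configuration over the single-CPU baseline, per benchmark."""
            return {b: config_throughput[b] / single_cpu[b] for b in single_cpu}

        for name, throughput in ni_configs.items():
            s = speedups(throughput)
            print(name, {b: round(v, 1) for b, v in s.items()},
                  "geomean:", round(geometric_mean(s.values()), 1))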

    An Abstract Model for Performance Estimation of the Embedded Multiprocessor CoreVA-MPSoC Technical Report (v1.0)

    Ax J, Flasskamp M, Sievers G, Klarhorst C, Jungeblut T, Kelly W. An Abstract Model for Performance Estimation of the Embedded Multiprocessor CoreVA-MPSoC. Technical Report (v1.0); 2015.

    Performance Estimation of Streaming Applications for Hierarchical MPSoCs

    Flasskamp M, Sievers G, Ax J, et al. Performance Estimation of Streaming Applications for Hierarchical MPSoCs. In: Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (RAPIDO). New York, NY: ACM Press; 2016: 1.

    CoreVA-MPSoC: A Many-core Architecture with Tightly Coupled Shared and Local Data Memories

    Ax J, Sievers G, Daberkow J, et al. CoreVA-MPSoC: A Many-core Architecture with Tightly Coupled Shared and Local Data Memories. IEEE Transactions on Parallel and Distributed Systems. 2018;29(5):1030-1043.

    Evaluation of interconnect fabrics for an embedded MPSoC in 28 nm FD-SOI

    Embedded many-core architectures contain dozens to hundreds of CPU cores that are connected via a highly scalable NoC interconnect. Our Multiprocessor System-on-Chip CoreVA-MPSoC combines the advantages of tightly coupled bus-based communication with the scalability of NoC approaches by adding a CPU cluster as an additional level of hierarchy. In this work, we analyze different cluster interconnect implementations with 8 to 32 CPUs and compare them in terms of resource requirements and performance to hierarchical NoC approaches. Using 28 nm FD-SOI technology, the area requirement for 32 CPUs and an AXI crossbar is 5.59 mm², including 23.61% for the interconnect, at a clock frequency of 830 MHz. In comparison, a hierarchical MPSoC with 4 CPU clusters and 8 CPUs in each cluster requires only 4.83 mm², including 11.61% for the interconnect. To evaluate the performance, we use a compiler for streaming applications to map programs to the different MPSoC configurations. We use this approach for a design-space exploration to find the most efficient architecture and partitioning for an application.
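
    The area figures above can be reproduced with simple bookkeeping; the sketch below recomputes the interconnect area from the quoted totals and shares, and adds an invented relative-throughput column plus a throughput-per-area metric purely to illustrate how a design-space exploration might rank configurations:

        # Sketch: compare MPSoC configurations by area and interconnect share.
        # The two area/share entries follow the abstract; the relative
        # throughput values and the efficiency metric are invented.

        configs = [
            # (name, total area in mm^2, interconnect share, relative throughput)
            ("32 CPUs, AXI crossbar",    5.59, 0.2361, 1.00),
            ("4 clusters x 8 CPUs, NoC", 4.83, 0.1161, 0.95),
        ]

        for name, area_mm2, ic_share, rel_throughput in configs:
            ic_area = area_mm2 * ic_share
            efficiency = rel_throughput / area_mm2  # assumed metric: throughput per mm^2
            print(f"{name}: interconnect {ic_area:.2f} mm^2, "
                  f"efficiency {efficiency:.3f} per mm^2")

        best = max(configs, key=lambda c: c[3] / c[1])
        print("most area-efficient configuration:", best[0])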

    Development of Energy Models for Design Space Exploration of Embedded Many-Core Systems

    Klarhorst C, Flasskamp M, Ax J, et al. Development of Energy Models for Design Space Exploration of Embedded Many-Core Systems. Presented at the 6th International Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES 2018), Manchester, United Kingdom.

    Scalable mapping of streaming applications onto MPSoCs using optimistic mixed integer linear programming

    Embedded streaming applications are facing increasingly demanding performance requirements in terms of throughput. A common mechanism for providing high compute power with a low energy budget is to use a very large number of low-power cores, often in the form of a Massively Parallel System on Chip (MPSoC). The challenge with programming such massively parallel systems is deciding how to optimally map the computation to individual cores to maximize throughput. In this work we present an automatic parallelizing compiler for the StreamIt programming language that efficiently and effectively maps computation to individual cores. The compiler must be both effective, meaning that it does a good job of optimizing for throughput, and efficient, meaning that the time taken to find such a mapping scales well as the number of cores and the size of the stream program increase. We improve on previous work that used Integer Linear Programming (ILP) to map StreamIt programs to multicore systems by formulating the mapping problem differently, using mostly real rather than integer variables. Using this so-called Mixed Integer Linear Programming (MILP) formulation dramatically reduces the solving cost compared to standard ILP. The alternative formulation creates what we call an optimistic solution, which then needs to be adjusted slightly to obtain a final feasible solution. We show that this new approach is always close to, if not better than, the previous approach in terms of effectiveness, while being dramatically better in terms of scalability and efficiency.
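
    The Python sketch below (using the PuLP library) illustrates the general "optimistic relaxation, then repair" idea described above: the filter-to-core assignment is relaxed to real-valued variables, the bottleneck core load is minimized, and the fractional solution is then rounded to a feasible integer mapping. It is a simplified stand-in rather than the paper's MILP formulation, and the filter work estimates, core count, and rounding heuristic are all assumptions:

        # Hypothetical sketch of relax-then-repair mapping, not the paper's
        # actual MILP formulation. Filter work estimates, the core count, and
        # the rounding heuristic are assumptions. Requires PuLP (pip install pulp).

        from pulp import LpProblem, LpMinimize, LpVariable, lpSum, PULP_CBC_CMD

        work = {"source": 5, "fir1": 40, "fir2": 40, "combine": 20, "sink": 5}  # assumed filter costs
        cores = range(4)

        prob = LpProblem("optimistic_mapping", LpMinimize)
        x = {(f, c): LpVariable(f"x_{f}_{c}", lowBound=0, upBound=1)  # fractional assignment
             for f in work for c in cores}
        bottleneck = LpVariable("bottleneck", lowBound=0)

        prob += bottleneck                             # objective: minimize the bottleneck load
        for f in work:                                 # every filter is fully assigned
            prob += lpSum(x[f, c] for c in cores) == 1
        for c in cores:                                # each core's load bounds the bottleneck
            prob += lpSum(work[f] * x[f, c] for f in work) <= bottleneck

        prob.solve(PULP_CBC_CMD(msg=False))

        # Repair step: give each filter to the core holding its largest fraction.
        mapping = {f: max(cores, key=lambda c: x[f, c].value()) for f in work}
        loads = {c: sum(work[f] for f in work if mapping[f] == c) for c in cores}
        print("optimistic (relaxed) bottleneck:", bottleneck.value())
        print("feasible mapping:", mapping, "actual bottleneck:", max(loads.values()))

    Rounding the relaxed solution can only increase the bottleneck, so the relaxed objective is an optimistic lower bound that the repair step turns into a feasible mapping, mirroring the adjustment step the abstract describes.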